Comparing Distributed Termination Detection Algorithms for Modern HPC Platforms
Authors
Abstract
This paper revisits distributed termination detection algorithms in the context of High-Performance Computing (HPC) applications. We introduce an efficient variant of the Credit Distribution Algorithm (CDA) and compare it to the original algorithm (HCDA) as well as to its two primary competitors: the Four Counters algorithm (4C) and the Efficient Delay-Optimal Distributed algorithm (EDOD). We analyze the behavior of each algorithm for some simplified task-based kernels and show the superiority of CDA in terms of the number of control messages. We then compare the implementation of these algorithms over a task-based runtime system, PaRSEC, and show the advantages and limitations of each approach in a real implementation.
Similar resources
A taxonomy of distributed termination detection algorithms
An important problem in the field of distributed systems is that of detecting the termination of a distributed computation. Distributed termination detection (DTD) is a difficult problem due to the fact that there is no simple way of gaining knowledge of the global state of the system. Of the algorithms proposed in the last 15 years, there are many similarities. We have categorized these algorithm...
Distributed Termination Detection: General model and Algorithms
Termination detection constitutes one of the basic problems of distributed computing, and many distributed algorithms have been proposed to solve it. These algorithms differ in the way they ensure consistency of the detection and in the assumptions they make concerning the behaviour of channels (FIFO or not, bounded delay or asynchronous, etc.). But all these algorithms consider a very simple model for...
Scaling Support Vector Machines on modern HPC platforms
Support Vector Machines (SVM) have been widely used in data-mining and Big Data applications as modern commercial databases start to attach an increasing importance to the analytic capabilities. In recent years, SVM was adapted to the field of High Performance Computing for power/performance prediction, auto-tuning, and runtime scheduling. However, even at the risk of losing prediction accuracy...
Termination Detection of Distributed Algorithms by Graph Relabelling Systems
A unified and general scheme for detecting the termination of distributed computations is proposed. This scheme uses the encoding of distributed algorithms in form of graph rewriting systems to transform the problem of adding termination detection to a distributed computation into an operation on graph rewriting systems. Various examples are used to illustrate this approach.
Algorithms and Scheduling for Distributed Heterogeneous Platforms
The purpose of the PRACE RI is to provide a sustainable high-quality infrastructure for Europe that can meet the most demanding needs of European HPC user communities through the provision of user access to the most powerful HPC systems available worldwide at any given time. In tandem with access to Tier-0 systems, the PRACE-2IP project will foster the coordination between national HPC resource...
Journal
Journal title: International Journal of Networking and Computing
Year: 2022
ISSN: 2185-2839, 2185-2847
DOI: https://doi.org/10.15803/ijnc.12.1_26